Updated on: 2022-11-29
In this project, my intention is to find out if using median returns in portfolio optimization could lead to better portfolio performance compared to mean returns. The use of mean returns might not accurately describe expected returns due to the non-normal distribution of financial returns. The mean is also known to be affected by outliers, whereas the median tends to be a more robust measure of central tendency.
I found that portfolios maximizing the median returns led to better diversification than maximizing the mean returns. Furthermore, minimizing the median absolute deviation resulted in a portfolio that has the highest return among portfolios minimizing the different risk measures. However, this project does not include an optimal portfolio selection which maximizes return for a given level of risk, or minimizes risk for a given level of return. The project does include a few other flaws, such as having only a small number of random portfolios generated for testing and not being able to show conclusively that the median measure is better than the mean.
library(doParallel) # For parallel computation in foreach loops
library(PortfolioAnalytics) # For portfolio optimization and analysis
library(RColorBrewer) # For color palettes in plots
library(tidyquant) # For quantmod and PerformanceAnalytics functions
library(tidyverse) # For dplyr and ggplot2 functions (data manipulation and plotting)
I retrieved the daily adjusted closing prices of 10 stocks from Yahoo Finance, starting January 2015 to June 2022. The daily returns are then calculated using the discrete/simple method to be used in calculating portfolio returns.
The 10 stocks used in this project are Procter & Gamble (PG), Walmart (WMT), Booking Holdings (BKNG), Salesforce (CRM), 3M (MMM), Starbucks (SBUX), Walt Disney (DIS), Home Depot (HD), Coca-Cola (KO), and NVIDIA (NVDA).
The daily adjusted closing price of the tickers can be retrieved using quantmod::getSymbols().
# Vector of tickers to include in portfolio
tickers <- c("PG", "WMT", "BKNG", "CRM", "MMM", "SBUX", "DIS", "HD", "KO", "NVDA")
startdate <- as.Date("2015-01-01")
enddate <- as.Date("2022-07-01")
price_data <- NULL
# Loop to get adjusted closing prices for all stocks
for(t in tickers) {
  price_data <- cbind(price_data,
                      quantmod::getSymbols(Symbols = t, src = "yahoo", auto.assign = FALSE,
                                           from = startdate, to = enddate, periodicity = "daily") %>% Ad())
}
# Check dimension of object, start and end date of data collected
dim(price_data); start(price_data); end(price_data)
## [1] 1887 10
## [1] "2015-01-02"
## [1] "2022-06-30"
# See first 6 observations in price_data
data.frame(head(price_data))
Discrete returns can be calculated using PerformanceAnalytics::Return.calculate().
return_data <- na.omit(PerformanceAnalytics::Return.calculate(prices = price_data, method = "discrete")) %>%
  `colnames<-`(paste("R", tickers, sep = "_"))
dim(return_data); data.frame(head(return_data))
## [1] 1886 10
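As a minimal sketch of what the discrete (simple) method computes, the base R snippet below applies \(R_t = P_t / P_{t-1} - 1\) to a short vector of made-up prices. The prices and variable names are hypothetical and are not taken from the Yahoo Finance data above.

```r
# Hypothetical adjusted closing prices for one stock (illustrative values only)
prices <- c(100, 102, 99.96, 104.958)

# Discrete (simple) return: R_t = P_t / P_{t-1} - 1
simple_returns <- prices[-1] / prices[-length(prices)] - 1

# Log (continuous) return for comparison: r_t = log(P_t / P_{t-1})
log_returns <- diff(log(prices))

# One return fewer than prices, which is why return_data has 1886 rows
# while price_data has 1887
round(simple_returns, 4)
```

Simple returns are the natural choice here because a portfolio's simple return is the weighted sum of its components' simple returns, which does not hold for log returns.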
The (arithmetic) mean and median are two measures of central tendency that can be used to describe expected returns of a stock.
Since the distribution of stock returns tends to be non-normal, the median may be a more appropriate measure. For example, the density plot of PG shows that the returns do not follow a normal distribution. The Q-Q plot in the top-left corner also indicates a non-normal distribution.
chart.Histogram(R = return_data$R_PG,
method = c("add.density", "add.normal", "add.qqplot"),
main = "Density Plot of PG Historical Returns")
legend(x = "topright", legend = c("Density Plot", "Normal Distribution"), lwd = 2, col = c("darkblue", "blue"))
The mean and median returns of each stock in the portfolio are:
stock_means <- apply(X = return_data, MARGIN = 2, FUN = mean)
stock_medians <- apply(X = return_data, MARGIN = 2, FUN = median)
data.frame(rbind(Mean = stock_means, Median = stock_medians))
Although the mean and median return values are quite small, we can notice a difference between the two measures.
The variance measures the squared deviation of returns from the mean, but standard deviation (SD) is used since it is in the same units as returns.
The SD of each stock in the portfolio is:
stock_sd <- apply(X = return_data, MARGIN = 2, FUN = sd)
data.frame(rbind(SD = stock_sd))
If returns were normally distributed, we could use the 68-95-99.7 rule, where about 68%/95%/99.7% of returns fall within 1/2/3 SDs of the mean.
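The percentages in this rule can be checked directly from the standard normal distribution. The short sketch below uses only base R; the variable names are illustrative.

```r
# Probability that a normally distributed value lies within k SDs of its mean
k <- 1:3
within_k_sd <- pnorm(k) - pnorm(-k)

# Approximately 0.6827, 0.9545 and 0.9973
round(within_k_sd, 4)
```

For fat-tailed return series, the observed share of returns beyond 3 SDs is typically larger than the roughly 0.3% that the normal distribution implies.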
The mean absolute deviation (MAD) is the mean of the absolute deviations between returns and a central point. The central point usually refers to the mean, but the median can be used as well.
\(MAD = \frac{1}{N}\sum_{i=1}^N |R_i - m(R)|\), where \(R_i\) is the return of a stock, and \(m(R)\) is the mean or median return of the stock.
The MAD around the mean and median of each stock in the portfolio are:
stock_MAD <- apply(X = return_data, MARGIN = 2, FUN = function(x) {
  mean(abs(x - mean(x)))
})
stock_MADmed <- apply(X = return_data, MARGIN = 2, FUN = function(x) {
  mean(abs(x - median(x)))
})
data.frame(rbind(MAD_mean = stock_MAD, MAD_median = stock_MADmed))
We can see that the MAD of each stock is smaller than its SD as SD places more weight on outliers than MAD. The MAD calculated with the mean and median are quite similar. I would stick with using MAD around the mean as a measure of risk.
The median absolute deviation (also abbreviated as MAD, but to distinguish the two measures, I use MeAD instead) is the median of the absolute deviations between returns and their median. It is seen as a more robust measure of variability than MAD or SD.
\(MeAD = med |R_i - med(R)|\), where \(R_i\) is the return of a stock, and \(med(R)\) is the median return of the stock.
The MeAD of each stock in the portfolio is:
stock_MeAD <- apply(X = return_data, MARGIN = 2, FUN = function(x) {
  median(abs(x - median(x)))
})
data.frame(rbind(MeAD = stock_MeAD))
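To see how the dispersion measures react to an outlier, the sketch below computes SD, MAD (around the mean and the median) and MeAD on a small made-up return vector. The data and variable names are hypothetical. Note that base R's stats::mad() scales its result by 1.4826 by default, so constant = 1 is needed to get the unscaled MeAD defined above.

```r
# Hypothetical returns; the last value is a deliberate outlier
x <- c(-0.02, -0.01, 0.00, 0.01, 0.02, 0.30)

sd_x       <- sd(x)                           # standard deviation
mad_mean   <- mean(abs(x - mean(x)))          # MAD around the mean
mad_median <- mean(abs(x - median(x)))        # MAD around the median
mead       <- median(abs(x - median(x)))      # median absolute deviation (MeAD)

# stats::mad() gives the same MeAD once the default scaling is switched off
mead_builtin <- mad(x, constant = 1)

round(c(SD = sd_x, MAD_mean = mad_mean, MAD_median = mad_median, MeAD = mead), 4)
```

In this toy example the single outlier inflates SD the most and MeAD the least, which is the ordering that motivates MeAD as a robust risk measure.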
The different return and risk measures based on mean and median are summarized below:
data.frame(rbind(Mean = stock_means, Median = stock_medians,
SD = stock_sd, MAD_mean = stock_MAD, MAD_median = stock_MADmed, MeAD = stock_MeAD))
Portfolio optimization using mean returns with variance/SD or MAD as the risk measure is common, and a simple Google search returns many research papers and articles on it. Using median returns instead of mean returns has also been widely researched. However, the use of MeAD as a risk measure in portfolio optimization does not seem to be documented. This project tested MeAD as part of the research, but it may be a spurious measure of risk, as no statistical tests or simulations were carried out to assess its significance.
Optimization based on median measures is usually not implemented by packages and functions, so I had to use a set of random hypothetical portfolios in this project.
Before optimizing any objectives, I generated a set of random portfolios which satisfy constraints where the sum of the component weights must be equal to 1 and the weight of each component is between 0% and 100% of the portfolio.
portspec <- PortfolioAnalytics::portfolio.spec(assets = tickers)
# Sum of weights constrained to 1, can also specify as type = "full_investment"
portspec <- PortfolioAnalytics::add.constraint(portfolio = portspec,
                                               type = "weight_sum",
                                               min_sum = 1, max_sum = 1)
# Weight of each portfolio component can vary between minimum of 0% and maximum of 100%
portspec <- PortfolioAnalytics::add.constraint(portfolio = portspec,
                                               type = "box",
                                               min = 0, max = 1)
portspec
## **************************************************
## PortfolioAnalytics Portfolio Specification
## **************************************************
##
## Call:
## PortfolioAnalytics::portfolio.spec(assets = tickers)
##
## Number of assets: 10
## Asset Names
## [1] "PG" "WMT" "BKNG" "CRM" "MMM" "SBUX" "DIS" "HD" "KO" "NVDA"
##
## Constraints
## Enabled constraint types
## - weight_sum
## - box (long only)
set.seed(43594)
rand_port <- PortfolioAnalytics::random_portfolios(portfolio = portspec,
                                                   permutations = 50000,
                                                   rp_method = "sample",
                                                   eliminate = TRUE)
dim(rand_port); head(rand_port)
## [1] 4341 10
## PG WMT BKNG CRM MMM SBUX DIS HD KO NVDA
## [1,] 0.100 0.100 0.100 0.100 0.100 0.100 0.100 0.100 0.100 0.100
## [2,] 0.090 0.380 0.026 0.008 0.084 0.000 0.052 0.164 0.174 0.022
## [3,] 0.034 0.074 0.120 0.190 0.260 0.010 0.102 0.012 0.172 0.026
## [4,] 0.154 0.002 0.028 0.216 0.152 0.016 0.216 0.032 0.000 0.184
## [5,] 0.104 0.000 0.000 0.378 0.000 0.006 0.500 0.004 0.000 0.008
## [6,] 0.046 0.034 0.100 0.018 0.560 0.154 0.020 0.006 0.062 0.000
4,341 portfolio permutations were found, which will be used for the rest of this project.
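For intuition about what a set of feasible random portfolios looks like, here is a simplified base R sketch that draws long-only, fully invested weights by normalizing positive random numbers. This is a different and simpler technique than the "sample" method used by PortfolioAnalytics above, and the seed, sizes and variable names are assumptions for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
n_assets     <- 10
n_portfolios <- 5

# Draw positive numbers and divide each row by its sum, so that every row
# is a weight vector that sums to 1 (exponential draws give weights that
# are uniformly distributed over the long-only simplex)
raw     <- matrix(rexp(n_portfolios * n_assets), nrow = n_portfolios)
weights <- raw / rowSums(raw)

# Verify the two constraints: full investment and the 0% to 100% box
stopifnot(all(abs(rowSums(weights) - 1) < 1e-12),
          all(weights >= 0), all(weights <= 1))

round(weights, 3)
```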
I used two different optimization objectives in this project: maximizing return (mean or median) and minimizing risk (SD, MAD, or MeAD).
To make the different strategies more practical, I attempted to replicate a half-yearly re-optimization strategy using the previous year's data. In this case, I would calculate the return and/or risk of the hypothetical portfolios with 2015 return data and implement the optimal weights in 2016H1. Then, I re-optimize the weights using data from 2015H2 to 2016H1 for 2016H2, and so on. The last re-optimization used 2021 data for 2022H1, since I only retrieved data up to 30 June 2022.
opt_periods <- c("2015", "2015-07/2016-06",
                 "2016", "2016-07/2017-06",
                 "2017", "2017-07/2018-06",
                 "2018", "2018-07/2019-06",
                 "2019", "2019-07/2020-06",
                 "2020", "2020-07/2021-06",
                 "2021")
ret_periods <- c("2016-01/2016-06", "2016-07/2016-12",
                 "2017-01/2017-06", "2017-07/2017-12",
                 "2018-01/2018-06", "2018-07/2018-12",
                 "2019-01/2019-06", "2019-07/2019-12",
                 "2020-01/2020-06", "2020-07/2020-12",
                 "2021-01/2021-06", "2021-07/2021-12",
                 "2022-01/2022-06")
data.frame(cbind(Optimization_Period = opt_periods, Return_Period = ret_periods))
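The rolling scheme in this table (optimize on a trailing window, hold for the next half-window, then step forward) can be sketched in base R on synthetic data. Everything below is hypothetical: the returns are simulated, the three candidate portfolios P1 to P3 stand in for the random portfolios, and the window lengths are shortened to 20 and 10 observations to keep the example small.

```r
set.seed(42)  # arbitrary seed for the simulated data
n_days   <- 40
lookback <- 20   # stands in for the one-year optimization period
hold     <- 10   # stands in for the half-year return period

# Simulated daily returns of three candidate portfolios (columns)
cand_returns <- matrix(rnorm(n_days * 3, mean = 0, sd = 0.01), ncol = 3,
                       dimnames = list(NULL, c("P1", "P2", "P3")))

# First day of each holding period, stepping forward one holding period at a time
starts   <- seq(from = lookback + 1, to = n_days - hold + 1, by = hold)
chosen   <- character(0)
realized <- numeric(0)

for (s in starts) {
  in_sample  <- cand_returns[(s - lookback):(s - 1), , drop = FALSE]
  out_sample <- cand_returns[s:(s + hold - 1), , drop = FALSE]

  # Pick the candidate with the highest in-sample median return
  best     <- which.max(apply(in_sample, 2, median))
  chosen   <- c(chosen, colnames(cand_returns)[best])
  realized <- c(realized, out_sample[, best])
}

chosen            # candidate selected for each holding period
length(realized)  # out-of-sample returns stitched across holding periods
```

Swapping median for another statistic, or which.max for which.min, gives the other objectives used later in the project; the key point is that selection only ever sees data from before the holding period.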
The daily returns of the hypothetical portfolios can be calculated from return_data and rand_port, based on the formula \(R_p = \sum_{i=1}^N R_i w_i\), where \(R_i\) is the return of stock \(i\) and \(w_i\) is its portfolio weight.
I used geometric chaining to aggregate returns and rebalanced the portfolios quarterly. The performance of the random portfolios in each optimization period can then be used to select the portfolio weights that optimized that period's return and/or risk.
rp_returns <- foreach(i = 1:nrow(rand_port), .combine = "cbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data,
                                                weights = rand_port[i, ],
                                                geometric = TRUE,
                                                rebalance_on = "quarters")
}
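As a small worked example of \(R_p = \sum_{i=1}^N R_i w_i\), the sketch below computes portfolio returns for two hypothetical assets over two days, first with fixed weights and then with weights that drift between rebalancing dates. The asset names, returns and weights are made up; the drift step reflects my understanding of how weights evolve between rebalances and is not a copy of the package's internals.

```r
# Hypothetical daily returns: rows are days, columns are assets
R <- matrix(c(0.01, -0.02,
              0.03,  0.00),
            nrow = 2, byrow = TRUE, dimnames = list(NULL, c("A", "B")))
w <- c(A = 0.6, B = 0.4)

# Fixed weights every day: R_p = sum_i R_i * w_i
fixed_ret <- as.numeric(R %*% w)

# Without a rebalance after day 1, each position grows with its own return,
# so the day-2 weights differ slightly from the starting weights
value_after_day1 <- w * (1 + R[1, ])
w_drift          <- value_after_day1 / sum(value_after_day1)
drift_ret_day2   <- sum(w_drift * R[2, ])

fixed_ret       # -0.002 and 0.018
drift_ret_day2  # slightly above 0.018 because asset A's weight has grown
```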
I also calculated the returns of an equal-weight portfolio to compare the performance of optimized portfolios.
# If weights are not supplied, an equal-weight portfolio is assumed
ewp_return <- PerformanceAnalytics::Return.portfolio(R = return_data,
                                                     geometric = TRUE,
                                                     rebalance_on = "quarters")
In this section, I find portfolios that maximize mean and median returns in each optimization period, although these types of objectives may not be practical in portfolio optimization. It assumes that investors create their portfolios based on the best historical returns. However, these strategies can give an idea of the risks taken by an investor, based on the drawdown of the portfolios.
Find weights that maximize mean portfolio returns in each optimization period:
maxmean_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Mean.arithmetic(x = rp_returns[i, ])
  opt_weight <- rand_port[which.max(tmp), ]
}
rownames(maxmean_weight) <- paste("OP", 1:nrow(maxmean_weight), sep = "")
data.frame(maxmean_weight)
Calculate return based on the selected portfolio weights in the return period:
# Returns of best mean portfolio in ret_period using weights from opt_period
maxmean_returns <- foreach(i = ret_periods, j = 1:nrow(maxmean_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ],
                                                weights = maxmean_weight[j, ],
                                                geometric = TRUE,
                                                rebalance_on = "quarters")
}
Find weights that maximize median portfolio returns in each optimization period:
# Find weights that maximize the median return of each optimization period
maxmed_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- apply(X = rp_returns[i, ], MARGIN = 2, FUN = median)
  opt_weight <- rand_port[which.max(tmp), ]
}
rownames(maxmed_weight) <- paste("OP", 1:nrow(maxmed_weight), sep = "")
data.frame(maxmed_weight)
Calculate return based on the selected portfolio weights in the return period:
# Returns of best median portfolio in ret_period using weights from opt_period
maxmed_returns <- foreach(i = ret_periods, j = 1:nrow(maxmed_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ],
                                                weights = maxmed_weight[j, ],
                                                geometric = TRUE,
                                                rebalance_on = "quarters")
}
Plot weights of best mean and best median portfolios:
chart.StackedBar(w = maxmean_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
main = "Optimal Weights of Best Mean Portfolio", ylab = "Weight")
chart.StackedBar(w = maxmed_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
main = "Optimal Weights of Best Median Portfolio", ylab = "Weight")
Plot cumulative return of best mean and best median portfolios against equal-weight portfolio return:
best_return <- cbind(maxmean_returns, maxmed_returns, ewp_return["2016/",]) %>%
  `colnames<-`(c("Best_Mean", "Best_Median", "Equal Weight"))
chart.CumReturns(R = best_return, geometric = TRUE,
legend.loc = "topleft",
main = "Cumulative Return of Best Return Portfolios")
chart.Drawdown(R = best_return, geometric = TRUE,
legend.loc = "bottomleft",
main = "Drawdown of Best Return Portfolios")
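The drawdown shown in such a chart is the percentage decline of cumulative wealth from its running peak. A minimal base R sketch on made-up returns follows; the return values and variable names are hypothetical, and treating the starting wealth of 1 as the first peak is an assumption of this sketch.

```r
# Hypothetical periodic returns
r <- c(0.10, -0.20, 0.05, 0.30)

# Cumulative wealth of 1 unit invested, chained geometrically
wealth <- cumprod(1 + r)

# Running peak, treating the starting wealth of 1 as the first peak
peak <- cummax(c(1, wealth))[-1]

# Drawdown: fractional distance below the running peak (0 at a new high)
drawdown     <- wealth / peak - 1
max_drawdown <- min(drawdown)

round(drawdown, 4)  # 0, -0.2, -0.16, 0
max_drawdown        # -0.2
```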
Tables of annualized returns and risk measures:
table.AnnualizedReturns(R = best_return, scale = 252, Rf = 0.03/252, geometric = TRUE)
table.DownsideRisk(R = best_return, scale = 252, Rf = 0.03/252, MAR = 0.08/252)
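For reference, the annualized figures in these tables can be approximated by hand. The sketch below uses 252 made-up daily returns and the same scale = 252 and 3% risk-free rate as above. The Sharpe ratio line follows my understanding of the usual definition (annualized excess return divided by annualized SD) and may differ in detail from the package's calculation.

```r
# Hypothetical daily returns covering one trading year
r <- rep(c(0.001, -0.0005), times = 126)
n <- length(r)
scale <- 252

# Geometric annualized return: compound all returns, then rescale to one year
ann_return <- prod(1 + r)^(scale / n) - 1

# Annualized standard deviation: daily SD scaled by the square root of time
ann_sd <- sd(r) * sqrt(scale)

# Annualized Sharpe ratio using a daily risk-free rate of 3% / 252 (assumed form)
rf_daily   <- 0.03 / scale
ann_excess <- prod(1 + (r - rf_daily))^(scale / n) - 1
sharpe     <- ann_excess / ann_sd

c(Return = ann_return, SD = ann_sd, Sharpe = sharpe)
```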
The Best Median portfolio had a lower annualized return and standard deviation than the Best Mean portfolio. Furthermore, it resulted in better diversification than the Best Mean portfolio, which was more concentrated in a single stock (mainly in NVDA). Therefore, using the median return in optimization may allow an investor to achieve a lower risk than using the mean return.
At the opposite end of the spectrum, this section finds portfolios that minimize risk in each optimization period.
Find weights that minimize variance/standard deviation in each optimization period:
minstd_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::StdDev(R = rp_returns[i, ])
  opt_weight <- rand_port[which.min(tmp), ]
}
rownames(minstd_weight) <- paste("OP", 1:nrow(minstd_weight), sep = "")
data.frame(minstd_weight)
Calculate return based on the selected portfolio weights in the return period:
minstd_returns <- foreach(i = ret_periods, j = 1:nrow(minstd_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ],
                                                weights = minstd_weight[j, ],
                                                geometric = TRUE,
                                                rebalance_on = "quarters")
}
Find weights that minimize MAD in each optimization period:
minmad_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- apply(X = rp_returns[i, ], MARGIN = 2, FUN = function(x) {
    mean(abs(x - mean(x)))
  })
  opt_weight <- rand_port[which.min(tmp), ]
}
rownames(minmad_weight) <- paste("OP", 1:nrow(minmad_weight), sep = "")
data.frame(minmad_weight)
Calculate return based on the selected portfolio weights in the return period:
minmad_returns <- foreach(i = ret_periods, j = 1:nrow(minmad_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ],
                                                weights = minmad_weight[j, ],
                                                geometric = TRUE,
                                                rebalance_on = "quarters")
}
Find weights that minimize MeAD in each optimization period:
minmead_weight <- foreach(i = opt_periods, .combine = "rbind") %do% {
  tmp <- apply(X = rp_returns[i, ], MARGIN = 2, FUN = function(x) {
    median(abs(x - median(x)))
  })
  opt_weight <- rand_port[which.min(tmp), ]
}
rownames(minmead_weight) <- paste("OP", 1:nrow(minmead_weight), sep = "")
data.frame(minmead_weight)
Calculate return based on the selected portfolio weights in the return period:
minmead_returns <- foreach(i = ret_periods, j = 1:nrow(minmead_weight), .combine = "rbind") %do% {
  tmp <- PerformanceAnalytics::Return.portfolio(R = return_data[i, ],
                                                weights = minmead_weight[j, ],
                                                geometric = TRUE,
                                                rebalance_on = "quarters")
}
Plot weights of minimum risk portfolios:
chart.StackedBar(w = minstd_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
main = "Optimal Weights of Minimum Variance Portfolio", ylab = "Weight")
chart.StackedBar(w = minmad_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
main = "Optimal Weights of Minimum MAD Portfolio", ylab = "Weight")
chart.StackedBar(w = minmead_weight, colorset = RColorBrewer::brewer.pal(n = 10, "Spectral"),
main = "Optimal Weights of Minimum MeAD Portfolio", ylab = "Weight")
Plot cumulative return of minimum risk portfolios against equal-weight portfolio return:
min_risk <- cbind(minstd_returns, minmad_returns, minmead_returns, ewp_return["2016/",]) %>%
  `colnames<-`(c("Min_Var", "Min_MAD", "Min_MeAD", "Equal Weight"))
chart.CumReturns(R = min_risk, geometric = TRUE,
legend.loc = "topleft",
main = "Cumulative Return of Minimum Risk Portfolios")
chart.Drawdown(R = min_risk, geometric = TRUE,
legend.loc = "bottomleft",
main = "Drawdown of Minimum Risk Portfolios")
Tables of annualized returns, risk measures and statistics:
table.AnnualizedReturns(R = min_risk, scale = 252, Rf = 0.03/252, geometric = TRUE)
table.DownsideRisk(R = min_risk, scale = 252, Rf = 0.03/252, MAR = 0.08/252)
The equal-weight portfolio had the highest annualized return and standard deviation, compared to the minimum risk portfolios. While the Minimum MeAD portfolio performed the best among the minimum risk portfolios, it cannot be concluded that the MeAD should be used extensively in practice, given the lack of testing.
The results of this project would suggest that portfolios maximizing the median returns would lead to better diversification than mean returns. Furthermore, minimizing the median absolute deviation resulted in a portfolio that has the highest return among portfolios minimizing the different risk measures.
It should be stressed that the results are not generalizable as there were only 10 stocks chosen for this project and the random portfolios generated were just a small subset of an astronomical number of possible permutations. However, the results do coincide with this paper, which was able to find that median models provided benefits of portfolio diversification and returns. This shows that the median measure is useful in certain scenarios, and should be considered in portfolio optimization.
Frost, J. Mean Absolute Deviation: Definition, Finding & Formula. Statistics By Jim. Retrieved 28 July 2022, from https://statisticsbyjim.com/basics/mean-absolute-deviation/
Wikipedia. (2022). Average absolute deviation. Retrieved 29 July 2022, from https://en.wikipedia.org/wiki/Average_absolute_deviation
Wikipedia. (2022). Median absolute deviation. Retrieved 28 July 2022, from https://en.wikipedia.org/wiki/Median_absolute_deviation